MonetDB/DataCell: Online Analytics in a Streaming Column-Store

Authors

  • Erietta Liarou
  • Stratos Idreos
  • Stefan Manegold
  • Martin L. Kersten
Abstract

In DataCell, we design streaming functionalities in a modern relational database kernel that targets big data analytics. This includes exploiting both its storage/execution engine and its optimizer infrastructure. We investigate the opportunities and challenges that arise with such a direction, and we show that it carries significant advantages for modern applications in need of online analytics, such as web logs, network monitoring, and scientific data management. The major challenge then becomes the efficient support for specialized stream features, e.g., multi-query processing and incremental window-based processing, as well as exploiting standard DBMS functionalities, such as indexing, in a streaming environment. In this demo, we present the DataCell system, an extension of the MonetDB open-source column-store for online analytics. The demo gives the user the opportunity to experience the features of DataCell, such as processing both stream and persistent data and performing window-based processing. The demo provides a visual interface to monitor the critical system components, e.g., how query plans transform from typical DBMS query plans to online query plans, how data flows through the query plans as the streams evolve, and how DataCell maintains intermediate results in columnar form to avoid repeated evaluation of the same stream portions. The demo also provides the ability to interactively set the test scenarios regarding input data and various DataCell knobs.
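
The idea of keeping intermediate results so that overlapping stream portions are not re-evaluated can be illustrated with a small sketch. The Python below is not DataCell code and makes no claim about its internals; the class `SlidingWindowSum`, the helper `window_sums`, and the parameters `window_size` and `step` are hypothetical names chosen for illustration. It splits the stream into basic windows of `step` values, caches one partial sum per basic window, and derives each sliding-window result from the cached partials instead of rescanning the raw values.

```python
from collections import deque


class SlidingWindowSum:
    """Sliding-window sum built from cached per-slice partial sums.

    Hypothetical illustration only: the names and structure are
    assumptions, not DataCell's actual design.
    """

    def __init__(self, window_size, step):
        if window_size % step != 0:
            raise ValueError("window_size must be a multiple of step")
        self.step = step
        self.slices_per_window = window_size // step
        self.partials = deque()  # cached sum of each completed basic window
        self.buffer = []         # values of the basic window still filling up
        self.running = 0         # sum of all cached partials

    def push(self, value):
        """Add one value; return the window sum if a window completed, else None."""
        self.buffer.append(value)
        if len(self.buffer) < self.step:
            return None
        # A basic window is complete: aggregate it once and cache the result.
        partial = sum(self.buffer)
        self.buffer = []
        self.partials.append(partial)
        self.running += partial
        # Expire the oldest basic window once it slides out of range.
        if len(self.partials) > self.slices_per_window:
            self.running -= self.partials.popleft()
        if len(self.partials) == self.slices_per_window:
            return self.running
        return None


def window_sums(values, window_size, step):
    """Feed a finite list through the operator and collect every window result."""
    op = SlidingWindowSum(window_size, step)
    results = (op.push(v) for v in values)
    return [r for r in results if r is not None]


print(window_sums([1, 2, 3, 4, 5, 6], window_size=4, step=2))  # [10, 18]
```

With `window_size=4` and `step=2`, the values 3 and 4 belong to both windows, yet they are summed only once: the second window reuses the cached partial for that slice and only aggregates the newly arrived values 5 and 6.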

Related resources

MonetDB: Two Decades of Research in Column-oriented Database Architectures

MonetDB is a state-of-the-art open-source column-store database management system targeting applications in need of analytics over large collections of data. MonetDB is actively used nowadays in health care, in telecommunications, as well as in scientific databases and in data management research, accumulating on average more than 10,000 downloads on a monthly basis. This paper gives a brief ov...

Single-click to Data Insights Transaction Replication and Deployment Automation Made Simple for the Cloud Age

In this report we present our initial work on making the MonetDB column-store analytical database ready for Cloud deployment. As we stand in the new space between research and industry, we have tried to combine approaches from both worlds. We provide details on how we utilize modern technologies and tools to automate the building of virtual machine images for Cloud, datacentre and desktop use. We als...

DataCell: Exploiting the Power of Relational Databases for Efficient Stream Processing

Designed for complex event processing, DataCell is a research prototype database system in the area of sensor stream systems. Under development at CWI, it belongs to the MonetDB database system family. CWI researchers innovatively built a stream engine directly on top of a database kernel, thus exploiting and merging technologies from the stream world and the rich area of database literature. T...

A Query Language for a Data Refinery Cell

In this work we propose the DataCell, an event management system designed as a flexible data hub in an ambient environment. It provides an orthogonal extension to SQL’03, called “basket expressions”, which behave as predicate windows over multiple streams and which can be bulk processed for good resource utilization. The functionality offered by basket expressions is illustrated with numerous e...

A column-oriented data stream engine

This paper introduces the DataCell, a data stream management system designed as a seamless integration of continuous queries based on bulk event processing in an SQL software stack. The continuous stream queries are based on a predicate-window, called “basket” expressions, which support arbitrary complex SQL subqueries including, but not limited to, temporal and sequence constraints. The DataCe...

Journal title:
  • PVLDB

Volume 5  Issue 

Pages  -

Publication date 2012